Only fetch alerts belong to the target repo and branch #3766

Merged
merged 2 commits into from
Feb 14, 2023

@huydhn huydhn commented Feb 14, 2023

This fixes a bug where the alert script fetches and closes all active alerts across all repos and branches. It should only fetch the alerts for the selected repo and branch, for example pytorch/pytorch and master.

Also ufmt format the script to beautify it.
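
To illustrate the idea behind the fix, here is a minimal sketch of restricting a list of alert issues to one repo and branch. It is not the code from this PR: `filter_alerts` is a hypothetical helper, and matching on the title suffix is an assumption based on the alert titles shown in the test output.

```python
from typing import Any, Dict, List


def filter_alerts(
    alerts: List[Dict[str, Any]], repo: str, branch: str
) -> List[Dict[str, Any]]:
    """Keep only open alerts whose title names the given repo and branch."""
    # Assumption: an alert title ends with "<repo> <branch>", as in
    # "... Recurrently Failing Jobs on pytorch/pytorch master".
    suffix = f" {repo} {branch}"
    return [
        alert
        for alert in alerts
        if not alert.get("closed", False)
        and alert.get("title", "").endswith(suffix)
    ]


# Trimmed-down versions of the two alerts from the test output.
sample_alerts = [
    {
        "number": 3763,
        "closed": False,
        "title": "[Pytorch] There are 3 Recurrently Failing Jobs on pytorch/pytorch master",
    },
    {
        "number": 3764,
        "closed": False,
        "title": "[Pytorch] There are 3 Recurrently Failing Jobs on pytorch/pytorch nightly",
    },
]

master_alerts = filter_alerts(sample_alerts, "pytorch/pytorch", "master")
print([alert["number"] for alert in master_alerts])  # [3763]
```

With a filter like this, a run for pytorch/builder main would see no alerts at all, so it could not close the alerts that belong to pytorch/pytorch.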

Testing

  • Builder (no active alert)
export REPO_TO_CHECK=pytorch/builder
export BRANCH_TO_CHECK=main
export WITH_FLAKY_TEST_ALERT=NO
export JOB_NAME_REGEX="nightly.pypi.binary.size.validation"

$ python3 torchci/scripts/check_alerts.py
Didn't find anything to alert on for pytorch/builder main
Clearing 0 alerts
  • PyTorch master (1 active alert)
export REPO_TO_CHECK=pytorch/pytorch
export BRANCH_TO_CHECK=master
export WITH_FLAKY_TEST_ALERT=YES
export JOB_NAME_REGEX=""

$ python3 torchci/scripts/check_alerts.py

[{"id": "I_kwDOFQyL-85ec5Kl", "title": "[Pytorch] There are 3 Recurrently Failing Jobs on pytorch/pytorch master", "closed": false, "number": 3763, "body": "Within the last 50 commits, there are the following failures on the master branch of pytorch: \n- [inductor / cuda11.7-py3.10-gcc7-sm86 / test (inductor_torchbench, 1, 1, linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/minihud?name_filter=inductor%20/%20cuda11.7-py3.10-gcc7-sm86%20/%20test%20%28inductor_torchbench%2C%201%2C%201%2C%20linux.g5.4xlarge.nvidia.gpu%29) failed consecutively starting with commit [4a5ce921a0934cb0cd3b1bba76973dd7270aa776](https://hud.pytorch.org/commit/pytorch/pytorch/4a5ce921a0934cb0cd3b1bba76973dd7270aa776)\n\n- [periodic / buck-build-test / buck-build-test](https://hud.pytorch.org/minihud?name_filter=periodic%20/%20buck-build-test%20/%20buck-build-test) failed consecutively starting with commit [c0e70776749f609d84ed3307ea03eb36d5570f7d](https://hud.pytorch.org/commit/pytorch/pytorch/c0e70776749f609d84ed3307ea03eb36d5570f7d)\n\n- [periodic / cuda11.7-py3.10-gcc7-sm86-periodic-dynamo-benchmarks / test (aot_eager_all, 1, 1, linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/minihud?name_filter=periodic%20/%20cuda11.7-py3.10-gcc7-sm86-periodic-dynamo-benchmarks%20/%20test%20%28aot_eager_all%2C%201%2C%201%2C%20linux.g5.4xlarge.nvidia.gpu%29) failed consecutively starting with commit [22e2fd554cf370765d4c44fe2b99c8bb6e42b0bb](https://hud.pytorch.org/commit/pytorch/pytorch/22e2fd554cf370765d4c44fe2b99c8bb6e42b0bb)\n\nPlease review the errors and revert if needed.", "createdAt": "2023-02-14T18:23:15Z", "comments": {"nodes": [{"bodyText": "These jobs started failing:\n\ninductor / cuda11.7-py3.10-gcc7-sm86 / test (inductor_torchbench, 1, 1, linux.g5.4xlarge.nvidia.gpu)", "databaseId": 1430218051}, {"bodyText": "These jobs started failing:\n\nperiodic / cuda11.7-py3.10-gcc7-sm86-periodic-dynamo-benchmarks / test (aot_eager_all, 1, 1, linux.g5.4xlarge.nvidia.gpu)", "databaseId": 
1430248850}]}}]

The first green SHA was at index -1 at c0e70776749f609d84ed3307ea03eb36d5570f7dand the first red SHA was at index 1 at b7e1477e9b69a80114cbc992216cf57adf30b207
No new change. Not updating any alert for pytorch/pytorch master
Num issues with `module: flaky-tests` label:  18
No new alert for flaky tests bots.
  • PyTorch nightly (1 active alert)
export BRANCH_TO_CHECK=nightly

$ python3 torchci/scripts/check_alerts.py

[{"id": "I_kwDOFQyL-85ec5TY", "title": "[Pytorch] There are 3 Recurrently Failing Jobs on pytorch/pytorch nightly", "closed": false, "number": 3764, "body": "Within the last 50 commits, there are the following failures on the master branch of pytorch: \n- [Build Official Docker Images / build (devel, linux/amd64)](https://hud.pytorch.org/minihud?name_filter=Build%20Official%20Docker%20Images%20/%20build%20%28devel%2C%20linux/amd64%29) failed consecutively starting with commit [6cbac32dc7ec58e63cada4d209c4887fc82a9d29](https://hud.pytorch.org/commit/pytorch/pytorch/6cbac32dc7ec58e63cada4d209c4887fc82a9d29)\n\n- [Build Official Docker Images / build (runtime, linux/arm64,linux/amd64)](https://hud.pytorch.org/minihud?name_filter=Build%20Official%20Docker%20Images%20/%20build%20%28runtime%2C%20linux/arm64%2Clinux/amd64%29) failed consecutively starting with commit [6cbac32dc7ec58e63cada4d209c4887fc82a9d29](https://hud.pytorch.org/commit/pytorch/pytorch/6cbac32dc7ec58e63cada4d209c4887fc82a9d29)\n\n- [binary_ios_upload-1 / build](https://hud.pytorch.org/minihud?name_filter=binary_ios_upload-1%20/%20build) failed consecutively starting with commit [5abc365268926e6d1558dcf1db87c8aaa2359f3a](https://hud.pytorch.org/commit/pytorch/pytorch/5abc365268926e6d1558dcf1db87c8aaa2359f3a)\n\nPlease review the errors and revert if needed.", "createdAt": "2023-02-14T18:23:46Z", "comments": {"nodes": []}}]

The first green SHA was at index -1 at 5abc365268926e6d1558dcf1db87c8aaa2359f3aand the first red SHA was at index 0 at e67dc17b067c70fd238da44598c698cfa97581f6
No new change. Not updating any alert for pytorch/pytorch nightly
Num issues with `module: flaky-tests` label:  18
No new alert for flaky tests bots.
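
The test runs above are driven entirely by environment variables. As a rough sketch of how such settings could be read in Python, the snippet below parses the four variables used in the tests. `AlertConfig` and `load_config` are hypothetical names, and the default values and the "empty regex means no job-name filter" behavior are assumptions, not taken from check_alerts.py.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AlertConfig:
    repo: str
    branch: str
    with_flaky_test_alert: bool
    job_name_regex: Optional[re.Pattern]


def load_config(env: Mapping[str, str] = os.environ) -> AlertConfig:
    """Build the settings from environment-style key/value pairs."""
    raw_regex = env.get("JOB_NAME_REGEX", "")
    return AlertConfig(
        # Assumed defaults; the real script may use different ones.
        repo=env.get("REPO_TO_CHECK", "pytorch/pytorch"),
        branch=env.get("BRANCH_TO_CHECK", "master"),
        with_flaky_test_alert=env.get("WITH_FLAKY_TEST_ALERT", "NO").upper() == "YES",
        # Assumption: an empty regex means "do not filter by job name".
        job_name_regex=re.compile(raw_regex) if raw_regex else None,
    )


# Same values as the pytorch/builder test run.
config = load_config(
    {
        "REPO_TO_CHECK": "pytorch/builder",
        "BRANCH_TO_CHECK": "main",
        "WITH_FLAKY_TEST_ALERT": "NO",
        "JOB_NAME_REGEX": "nightly.pypi.binary.size.validation",
    }
)
print(config.repo, config.branch, config.with_flaky_test_alert)
```

Passing a plain dictionary instead of `os.environ` keeps the parsing easy to exercise in a unit test without touching the real environment.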

@huydhn huydhn requested review from izaitsevfb and a team February 14, 2023 19:38
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 14, 2023
@huydhn huydhn marked this pull request as ready for review February 14, 2023 20:38

huydhn commented Feb 14, 2023

The fix itself is ready, but I want to add one more test to cover the logic.

izaitsevfb commented

> Also ufmt format the script to beautify it.

Not super important, but one way to make a logic change + formatting easier to review is to split them into two separate commits as GH allows to review commits one by one.

@izaitsevfb izaitsevfb left a comment

Looks good. Thanks for the quick fix, Huy!


huydhn commented Feb 14, 2023

> Also ufmt format the script to beautify it.
>
> Not super important, but one way to make a logic change + formatting easier to review is to split them into two separate commits as GH allows to review commits one by one.

Noted. I will keep the ufmt changes in this PR because they are already mixed in with the fix. Undoing ufmt and re-applying the fix would be a pain, so I'll remember to split them next time. The most important part of the fix is in the fetch_alerts function, as you have already noticed. The rest is ufmt doing its job.

@huydhn huydhn merged commit fea5b1c into pytorch:main Feb 14, 2023
izaitsevfb added a commit that referenced this pull request Feb 14, 2023
Reverts #3765, re-enabling the alerts in `pytorch/builder`, since the underlying issue is fixed in #3766.